conditional flow
Energy Guided Geometric Flow Matching
Zweig, Aaron, Zhang, Mingxuan, Azizi, Elham, Knowles, David
A useful inductive bias for temporal data is that trajectories should stay close to the data manifold. Traditional flow matching relies on straight conditional paths, and flow matching methods which learn geodesics rely on RBF kernels or nearest neighbor graphs that suffer from the curse of dimensionality. We propose to use score matching and annealed energy distillation to learn a metric tensor that faithfully captures the underlying data geometry and informs more accurate flows. We demonstrate the efficacy of this strategy on synthetic manifolds with analytic geodesics, and interpolation of cell
- Europe > Netherlands > South Holland > Leiden (0.05)
- Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)
Low-Field Magnetic Resonance Image Quality Enhancement using a Conditional Flow Matching Model
Nguyen, Huu Tien, Eldaly, Ahmed Karam
This paper introduces a novel framework for image quality transfer based on conditional flow matching (CFM). Unlike conventional generative models that rely on iterative sampling or adversarial objectives, CFM learns a continuous flow between a noise distribution and target data distributions through the direct regression of an optimal velocity field. We evaluate this approach in the context of low-field magnetic resonance imaging (LF-MRI), a rapidly emerging modality that offers affordable and portable scanning but suffers from inherently low signal-to-noise ratio and reduced diagnostic quality. Our framework is designed to reconstruct high-field-like MR images from their corresponding low-field inputs, thereby bridging the quality gap without requiring expensive infrastructure. Experiments demonstrate that CFM not only achieves state-of-the-art performance, but also generalizes robustly to both in-distribution and out-of-distribution data. Importantly, it does so while utilizing significantly fewer parameters than competing deep learning methods. These results underline the potential of CFM as a powerful and scalable tool for MRI reconstruction, particularly in resource-limited clinical environments.
- Europe > United Kingdom > England > Greater London > London (0.04)
- Europe > United Kingdom > England > Devon > Exeter (0.04)
Aleatoric Uncertainty Medical Image Segmentation Estimation via Flow Matching
Van Nguyen, Phi, Trinh, Ngoc Huynh, Nguyen, Duy Minh Lam, Nguyen, Phu Loc, Tran, Quoc Long
Quantifying aleatoric uncertainty in medical image segmentation is critical since it is a reflection of the natural variability observed among expert annotators. A conventional approach is to model the segmentation distribution using the generative model, but current methods limit the expression ability of generative models. While current diffusion-based approaches have demonstrated impressive performance in approximating the data distribution, their inherent stochastic sampling process and inability to model exact densities limit their effectiveness in accurately capturing uncertainty. In contrast, our proposed method leverages conditional flow matching, a simulation-free flow-based generative model that learns an exact density, to produce highly accurate segmentation results. By guiding the flow model on the input image and sampling multiple data points, our approach synthesizes segmentation samples whose pixel-wise variance reliably reflects the underlying data distribution. This sampling strategy captures uncertainties in regions with ambiguous boundaries, offering robust quantification that mirrors inter-annotator differences. Experimental results demonstrate that our method not only achieves competitive segmentation accuracy but also generates uncertainty maps that provide deeper insights into the reliability of the segmentation outcomes.
- Asia > Vietnam > Hanoi > Hanoi (0.04)
- Asia > India > Andhra Pradesh > Bay of Bengal (0.04)
- Asia > China > Guangdong Province > Shenzhen (0.04)
Streaming Flow Policy: Simplifying diffusion/flow-matching policies by treating action trajectories as flow trajectories
Jiang, Sunshine, Fang, Xiaolin, Roy, Nicholas, Lozano-Pérez, Tomás, Kaelbling, Leslie Pack, Ancha, Siddharth
Recent advances in diffusion$/$flow-matching policies have enabled imitation learning of complex, multi-modal action trajectories. However, they are computationally expensive because they sample a trajectory of trajectories: a diffusion$/$flow trajectory of action trajectories. They discard intermediate action trajectories, and must wait for the sampling process to complete before any actions can be executed on the robot. We simplify diffusion$/$flow policies by treating action trajectories as flow trajectories. Instead of starting from pure noise, our algorithm samples from a narrow Gaussian around the last action. Then, it incrementally integrates a velocity field learned via flow matching to produce a sequence of actions that constitute a single trajectory. This enables actions to be streamed to the robot on-the-fly during the flow sampling process, and is well-suited for receding horizon policy execution. Despite streaming, our method retains the ability to model multi-modal behavior. We train flows that stabilize around demonstration trajectories to reduce distribution shift and improve imitation learning performance. Streaming flow policy outperforms prior methods while enabling faster policy execution and tighter sensorimotor loops for learning-based robot control. Project website: https://streaming-flow-policy.github.io/
- Europe > United Kingdom > North Sea > Southern North Sea (0.05)
- Asia > South Korea > Daegu > Daegu (0.04)
- North America > United States > Massachusetts (0.04)
- (3 more...)
Metriplectic Conditional Flow Matching for Dissipative Dynamics
Metriplectic conditional flow matching (MCFM) learns dissipative dynamics without violating first principles. Neural surrogates often inject energy and destabilize long-horizon rollouts; MCFM instead builds the conservative-dissipative split into both the vector field and a structure preserving sampler. MCFM trains via conditional flow matching on short transitions, avoiding long rollout adjoints. In inference, a Strang-prox scheme alternates a symplectic update with a proximal metric step, ensuring discrete energy decay; an optional projection enforces strict decay when a trusted energy is available. We provide continuous and discrete time guarantees linking this parameterization and sampler to conservation, monotonic dissipation, and stable rollouts. On a controlled mechanical benchmark, MCFM yields phase portraits closer to ground truth and markedly fewer energy-increase and positive energy rate events than an equally expressive unconstrained neural flow, while matching terminal distributional fit.
- North America > United States > New York > Monroe County > Rochester (0.04)
- Europe > Switzerland (0.04)
Flow Matching Policy Gradients
McAllister, David, Ge, Songwei, Yi, Brent, Kim, Chung Min, Weber, Ethan, Choi, Hongsuk, Feng, Haiwen, Kanazawa, Angjoo
Flow-based generative models, including diffusion models, excel at modeling continuous distributions in high-dimensional spaces. In this work, we introduce Flow Policy Optimization (FPO), a simple on-policy reinforcement learning algorithm that brings flow matching into the policy gradient framework. FPO casts policy optimization as maximizing an advantage-weighted ratio computed from the conditional flow matching loss, in a manner compatible with the popular PPO-clip framework. It sidesteps the need for exact likelihood computation while preserving the generative capabilities of flow-based models. Unlike prior approaches for diffusion-based reinforcement learning that bind training to a specific sampling method, FPO is agnostic to the choice of diffusion or flow integration at both training and inference time. We show that FPO can train diffusion-style policies from scratch in a variety of continuous control tasks. We find that flow-based models can capture multimodal action distributions and achieve higher performance than Gaussian policies, particularly in under-conditioned settings.
- Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
- Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
- Asia > Middle East > Jordan (0.04)
An Introduction to Flow Matching and Diffusion Models
Holderrieth, Peter, Erives, Ezra
Diffusion and flow-based models have become the state of the art for generative AI across a wide range of data modalities, including images, videos, shapes, molecules, music, and more. This tutorial provides a self-contained introduction to diffusion and flow-based generative models from first principles. We systematically develop the necessary mathematical background in ordinary and stochastic differential equations and derive the core algorithms of flow matching and denoising diffusion models. We then provide a step-by-step guide to building image and video generators, including training methods, guidance, and architectural design. This tutorial is ideal for machine learning researchers who want to develop a principled understanding of the theory and practice of generative AI.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
- (2 more...)
On the Closed-Form of Flow Matching: Generalization Does Not Arise from Target Stochasticity
Bertrand, Quentin, Gagneux, Anne, Massias, Mathurin, Emonet, Rémi
Modern deep generative models can now produce high-quality synthetic samples that are often indistinguishable from real training data. A growing body of research aims to understand why recent methods -- such as diffusion and flow matching techniques -- generalize so effectively. Among the proposed explanations are the inductive biases of deep learning architectures and the stochastic nature of the conditional flow matching loss. In this work, we rule out the latter -- the noisy nature of the loss -- as a primary contributor to generalization in flow matching. First, we empirically show that in high-dimensional settings, the stochastic and closed-form versions of the flow matching loss yield nearly equivalent losses. Then, using state-of-the-art flow matching models on standard image datasets, we demonstrate that both variants achieve comparable statistical performance, with the surprising observation that using the closed-form can even improve performance.
- Europe > France (0.04)
- Asia > Middle East > Syria > Daraa Governorate > Dar'a (0.04)
BinauralFlow: A Causal and Streamable Approach for High-Quality Binaural Speech Synthesis with Flow Matching Models
Liang, Susan, Markovic, Dejan, Gebru, Israel D., Krenn, Steven, Keebler, Todd, Sandakly, Jacob, Yu, Frank, Hassel, Samuel, Xu, Chenliang, Richard, Alexander
Binaural rendering aims to synthesize binaural audio that mimics natural hearing based on a mono audio and the locations of the speaker and listener. Although many methods have been proposed to solve this problem, they struggle with rendering quality and streamable inference. Synthesizing high-quality binaural audio that is indistinguishable from real-world recordings requires precise modeling of binaural cues, room reverb, and ambient sounds. Additionally, real-world applications demand streaming inference. To address these challenges, we propose a flow matching based streaming binaural speech synthesis framework called BinauralFlow. We consider binaural rendering to be a generation problem rather than a regression problem and design a conditional flow matching model to render high-quality audio. Moreover, we design a causal U-Net architecture that estimates the current audio frame solely based on past information to tailor generative models for streaming inference. Finally, we introduce a continuous inference pipeline incorporating streaming STFT/ISTFT operations, a buffer bank, a midpoint solver, and an early skip schedule to improve rendering continuity and speed. Quantitative and qualitative evaluations demonstrate the superiority of our method over SOTA approaches. A perceptual study further reveals that our model is nearly indistinguishable from real-world recordings, with a $42\%$ confusion rate.
- North America > United States > New York > Monroe County > Rochester (0.04)
- North America > Canada (0.04)
- Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
- (2 more...)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Vision (0.93)
- Information Technology > Artificial Intelligence > Natural Language (0.89)
- Information Technology > Artificial Intelligence > Speech > Speech Synthesis (0.62)
Model alignment using inter-modal bridges
Foundation models have demonstrated remarkable performance across modalities such as language and vision. However, model reuse across distinct modalities (e.g., text and vision) remains limited due to the difficulty of aligning internal representations. Existing methods require extensive paired training data or are constrained to specific domains. We introduce a semi-supervised approach for model alignment via conditional flow matching. The conditional flow between latent spaces of different modalities (e.g., text-to-image or biological-to-artificial neuronal activity) can be learned in two settings: ($1$) solving a (balanced or unbalanced) optimal transport problem with an inter-space bridge cost, and ($2$) performing memory-efficient alignment using labelled exemplars. Despite being constrained by the original models' capacity, our method--under both settings--matches downstream task performance of end-to-end trained models on object recognition and image generation tasks across MNIST, ImageNet, and \cite{majaj2015simple} datasets, particularly when labelled training data is scarce ($<20\%$). Our method provides a data-efficient solution for inter-modal model alignment with minimal supervision.
- North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)